Abstract: We consider Linear Programming (LP) decoding of
a fixed Low-Density Parity-Check (LDPC) code over the Binary
Symmetric Channel (BSC). The LP decoder fails when it outputs
a pseudo-codeword which is not a codeword. We design an
efficient algorithm termed the Instanton Search Algorithm (ISA)
which, given a random input, generates a set of flips called the
BSC-instanton. We prove that: (a) the LP decoder fails for any
set of flips with support vector including an instanton; (b) for
any input, the algorithm outputs an instanton in a number
of steps upper-bounded by twice the number of flips in the
input. Repeated a sufficient number of times, the ISA outputs
the number of unique instantons of different sizes.

Index Terms: Low-Density Parity-Check Codes, Linear
Programming Decoding, Binary Symmetric Channel,
Pseudo-Codewords, Error-Floor



I. Introduction 

The significance of Low-Density Parity-Check (LDPC) 
codes [1] is in their capacity-approaching performance when 
decoded using low complexity iterative algorithms, such as 
Belief Propagation (BP) [1], [2]. A properly chosen sequence of
LDPC codes can be made asymptotically good, i.e. iterative 
decoding guarantees exponential decay of error probability in 
the code length n when the noise is below a finite threshold. 
Iterative decoders operate by passing messages along the edges 
of a graphical representation of a code known as the Tanner 
graph [3], and are optimal when the underlying graph is a tree. 
However, the decoding becomes sub-optimal in the presence 
of cycles, and hence the above threshold statement is of a 
limited practical use for the analysis of a fixed code. The linear 
programming (LP) decoding introduced by Feldman et al. [4], 
is another sub-optimal algorithm for decoding LDPC codes, 
which has higher complexity but is more amenable to analysis. 

The typical performance measures of a decoder (either LP 
or BP) for a fixed code are the Bit-Error-Rate (BER) and/or
the Frame-Error-Rate (FER) as functions of the Signal-to- 
Noise Ratio (SNR). A typical BER/FER vs SNR curve consists 
of two distinct regions. At small SNR, the error probability 
decreases rapidly with the SNR, and the curve forms the so- 
called water-fall region. The decrease slows down at moderate 
values turning into the error-floor asymptotic at very large 
SNR [5]. This transient behavior and the error-floor asymptotic
originate from the sub-optimality of the decoding, i.e., the 
ideal maximum-likelihood (ML) curve would not show such 
a dramatic change in the BER/FER with the SNR increase. 

After the formulation of the problem by Richardson [5], a 
significant effort has been devoted to the analysis of the error 
floor phenomenon. Given that the decoding sub-optimality is 
expressed in the domain where the error probability is small, 
the troublesome noise configurations leading to decoding fail- 
ures and controlling the error-floor asymptotic are extremely 
rare, and analytical rather than simulation methods for their 
characterization are necessary. It is worth noting here that most 
of the analytical methods developed in the theory of iterative 
decoding have focused on ensembles of codes rather than a 
given fixed code. 

The failures of iterative decoding over the binary erasure 
channel (BEC) are well understood in terms of combinatorial 
objects known as stopping sets [6]. For iterative decoding 
on the Additive White Gaussian Noise (AWGN) channel and 
the BSC, the decoding failures have been characterized in 
terms of trapping sets [5], [7] and pseudo-codewords [8], [9], 
[10]. Richardson [5] introduced the notion of trapping sets 
and proposed a semi-analytical method to estimate the FER 
performance of a given code on the AWGN channel in the 
error floor region. The method was successfully applied to hard 
decision decoding over the BSC in [7]. The approach of [5] 
was further refined by Stepanov et al. [11], using instantons. 
Pseudo-codewords were first discussed in the context of itera- 
tive decoders using computation trees [8] and later using graph 
covers [9], [10]. Pseudo-codeword distributions were found 
for the special cases of codes from Euclidean and projective 
planes [12]. A detailed analysis of the pseudo-codewords was 
presented by Kelley and Sridhara [13], who discussed the 
bounds on pseudo-codeword size in terms of the girth and 
the minimum left-degree of the underlying Tanner graph. The 
bounds were further investigated by Xia and Fu [14]. Pseudo- 
codeword analysis has also been extended to the convolutional 
LDPC codes by Smarandache et al. [15]. (See also [16] for 
an exhaustive list of references for this and related subjects.) 

Pseudo-codewords can also be used to understand the failures
of the LP decoder [4]. The pseudo-codewords for the
LP decoder are equivalent to stopping sets for the case of the 
BEC. For the AWGN channel, the pseudo-codewords of the 
LP decoder are related to the pseudo-codewords arising from 
graph covers [10]. In fact, in [10] Vontobel and Koetter have 
also pointed out relations between pseudo-codewords arising 
from graph covers and trapping sets.

Closely related to the pseudo-codewords and the trapping 
sets are the noise configurations that lead to decoding failures 
which are termed instantons [11]. Finding the instantons is
a difficult task which has so far admitted only heuristic solutions
[7], [17]. In this regard, the most efficient approach, coined
the Pseudo-Codeword-Search (PCS) algorithm, was suggested
in [18] for LP decoding over a continuous channel (with the
Additive White Gaussian Noise (AWGN) channel used as an
enabling example). Given
a sufficiently strong random input, the outcome of the PCS 
algorithm is an instanton. The resulting distribution of the 
instantons (or respective pseudo-codewords) thus provides a 
compact and algorithmically feasible characterization of the 
AWGN-LP performance of the given code. 

In this paper, we consider pseudo-codewords and instantons 
of the LP decoder for the BSC. We define the BSC-instanton 
as a noise configuration which the LP decoder decodes into 
a pseudo-codeword distinct from the all-zero-codeword while 
any reduction of the (number of flips in) BSC-instanton leads 
to the all-zero-codeword. Being a close relative of the BP 
decoder (see [19], [20] for discussions of different aspects 
of this relation), the LP decoder appeals due to the following 
benefits: (a) it has the ML certificate, i.e., if the output of the
decoder is a codeword, then the ML decoder is also guaranteed
to decode into the same codeword; (b) the output of the LP
decoder is discrete even if the channel noise is continuous
(meaning that problems with numerical accuracy do not arise);
(c) its analysis is simpler due to the readily available set of
powerful analytical tools from the optimization theory; and
(d) it allows systematic sequential improvement, which results
in decoder flexibility and feasibility of an LP-based ML for 
moderately large codes [21], [22]. While slower decoding 
speed is usually cited as a disadvantage of the LP decoder, 
this potential problem can be significantly reduced, thanks to 
the recent progress in smart sequential use of LP constraints 
[23] and/or appropriate graphical transformations [22], [24], 
[25]. 

The two main contributions of this paper are: (1) character- 
ization of all the failures of the LP decoder over the BSC in 
terms of the instantons, and (2) a provably efficient Instanton 
Search Algorithm (ISA). Following the idea of Chertkov and
Stepanov [18], for a given random binary n-tuple, the ISA
generates a BSC-instanton that is guaranteed to be decoded
by the LP decoder into a pseudo-codeword distinct from the 
all-zero-codeword. Our ISA constitutes a significantly stronger 
algorithm than the one of [18] due to its property that it outputs 
an instanton in the number of steps upper-bounded by twice 
the number of flips in the original configuration the algorithm 
is initiated with. 

The rest of the paper is organized as follows. In Section II, 
we give a brief introduction to the LDPC codes, LP decoding 
and pseudo-codewords. In Section III, we introduce the BSC- 
specific notions of the pseudo-codeword weight, medians and 
instantons (defined as special sets of flips), their costs, and
we also prove a set of useful lemmata emphasizing the
significance of the instanton analysis. In Section IV, we
describe the ISA and prove our main result concerning bounds
on the number of iterations required to output an instanton. We 
present the ISA test, as applied to the [155, 64, 20] Tanner code 
[26], in Section V. We summarize our results and conclude by 
listing some open problems in Section VI. 

II. Preliminaries: LDPC Codes, LP Decoder and 
Pseudo-Codewords 

In this Section, we discuss the LP decoder and the notion 
of pseudo-codewords. We adopt the formulation of the LP 
decoder and the terminology from [4], and thus the interested 
reader is advised to refer to [4] for more details. 

Let C be a binary LDPC code defined by a Tanner graph
G with two sets of nodes: the set of variable nodes $V =
\{1, 2, \ldots, n\}$ and the set of check nodes $C = \{1, 2, \ldots, m\}$.
The adjacency matrix of G is $H$, a parity-check matrix of
C, with m rows corresponding to the check nodes and n
columns corresponding to the variable nodes. A binary vector
$c = (c_1, \ldots, c_n)$ is a codeword iff $cH^T = 0$. The support of
a vector $r = (r_1, r_2, \ldots, r_n)$, denoted by supp(r), is defined
as the set of all positions i such that $r_i \neq 0$.
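The codeword test and the support can be illustrated with a minimal Python sketch. The matrix `H_TOY` below is a hypothetical toy example chosen only for illustration (it is not a code studied in this paper), and positions are 0-based in the code whereas the text counts from 1.

```python
def supp(r):
    """Support of r: positions with a non-zero entry (0-based here)."""
    return {i for i, x in enumerate(r) if x != 0}


def is_codeword(c, H):
    """True iff every row of H (every parity check) is satisfied modulo two."""
    return all(sum(h * x for h, x in zip(row, c)) % 2 == 0 for row in H)


# Hypothetical toy parity-check matrix (3 checks, 7 variables), illustration only.
H_TOY = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

print(is_codeword([1, 1, 1, 0, 0, 0, 0], H_TOY))
print(supp([0, 1, 0, 0, 1, 0, 0]))
```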

We assume that a codeword $y$ is transmitted over a discrete
symmetric memoryless channel and is received as $\hat{y}$. The
channel is characterized by $\Pr[\hat{y}_i \mid y_i]$, which denotes the probability
that $y_i$ is received as $\hat{y}_i$. The negative log-likelihood ratio
(LLR) corresponding to the variable node i is given by

$$\gamma_i = \log\left(\frac{\Pr(\hat{y}_i \mid y_i = 0)}{\Pr(\hat{y}_i \mid y_i = 1)}\right).$$

The ML decoding of the code C allows a convenient LP
formulation in terms of the codeword polytope poly(C) whose
vertices correspond to the codewords in C. The ML-LP decoder
finds $f = (f_1, \ldots, f_n)$ minimizing the cost function $\sum_{i \in V} \gamma_i f_i$
subject to the constraint $f \in \mathrm{poly}(C)$. The formulation is
compact but impractical because the number of constraints is
exponential in the code length.

Hence a relaxed polytope is defined as the intersection of
all the polytopes associated with the local codes introduced for
all the checks of the original code. Associating $(f_1, \ldots, f_n)$
with bits of the code we require

$$0 \le f_i \le 1, \quad \forall i \in V. \qquad (1)$$



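For every check node j, let $N(j)$ denote the set of variable
nodes which are neighbors of j. Let $E_j = \{T \subseteq N(j) :
|T| \text{ is even}\}$. The polytope $Q_j$ associated with the check node
j is defined as the set of points (f, w) for which the following
constraints hold

$$0 \le w_{j,T} \le 1, \quad \forall T \in E_j, \qquad (2)$$
$$\sum_{T \in E_j} w_{j,T} = 1, \qquad (3)$$
$$f_i = \sum_{T \in E_j,\, T \ni i} w_{j,T}, \quad \forall i \in N(j). \qquad (4)$$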
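As an illustration of the local constraints for a single check node, the following Python sketch enumerates the even-sized subsets of a neighborhood and verifies that a given pair (f, w) satisfies the box, normalization and consistency conditions of the formulation in [4]. The function names, the dictionary representation of w and the example values are assumptions made for this sketch only.

```python
from fractions import Fraction
from itertools import combinations


def even_subsets(neigh):
    """E_j: all even-sized subsets of the neighborhood, including the empty set."""
    items = sorted(neigh)
    return [frozenset(T) for k in range(0, len(items) + 1, 2)
            for T in combinations(items, k)]


def satisfies_local_constraints(f, w_j, neigh):
    """Check the local constraints of one check node for the point (f, w_j).

    w_j maps frozenset subsets T to weights w_{j,T}; missing subsets count as 0.
    """
    E_j = even_subsets(neigh)
    if any(T not in E_j for T in w_j):
        return False  # weight placed on an odd-sized or foreign subset
    if any(not 0 <= w_j.get(T, 0) <= 1 for T in E_j):
        return False  # box constraint violated
    if sum(w_j.get(T, 0) for T in E_j) != 1:
        return False  # weights must sum to one
    # Each f_i must equal the total weight of the subsets containing i.
    return all(f[i] == sum(w_j.get(T, 0) for T in E_j if i in T) for i in neigh)


# Hypothetical degree-3 check on variables 0, 1, 2 with a fractional point.
NEIGH = [0, 1, 2]
Q4 = Fraction(1, 4)
W_FRACTIONAL = {frozenset(): Q4, frozenset({0, 1}): Q4,
                frozenset({0, 2}): Q4, frozenset({1, 2}): Q4}
F_FRACTIONAL = [Fraction(1, 2)] * 3
print(satisfies_local_constraints(F_FRACTIONAL, W_FRACTIONAL, NEIGH))
```

Exact rational arithmetic (`Fraction`) is used so that the equality tests are not affected by floating-point rounding.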



Now, let $Q = \cap_j Q_j$ be the set of points (f, w) such that (1)-
(4) hold for all $j \in C$. (Note that Q, which is also referred
to as the fundamental polytope [9], [10], is a function of the
Tanner graph G and consequently the parity-check matrix H
representing the code C.) The Linear Code Linear Program
(LCLP) can be stated as

$$\min_{(f,w)} \sum_{i \in V} \gamma_i f_i, \quad \text{s.t. } (f,w) \in Q.$$



For the sake of brevity, the decoder based on the LCLP is 
referred to in the following as the LP decoder. A solution 
(f, w) to the LCLP such that all $f_i$s and $w_{j,T}$s are integers is
known as an integer solution. The integer solution represents 
a codeword [4]. It was also shown in [4] that the LP decoder 
has the ML certificate, i.e., if the output of the decoder is a 
codeword, then the ML decoder would decode into the same 
codeword. The LCLP can fail, generating an output which is 
not a codeword. 

The performance of the LP decoder can be analyzed in terms 
of the pseudo-codewords, originally defined as follows: 

Definition 1: [4] An integer pseudo-codeword is a vector $p =
(p_1, \ldots, p_n)$ of non-negative integers such that, for every
parity check $j \in C$, the neighborhood $\{p_i : i \in N(j)\}$ is
a sum of local codewords.

Alternatively, one may choose to define a re-scaled
pseudo-codeword, $p = (p_1, \ldots, p_n)$ where $0 \le p_i \le 1, \forall i \in V$,
simply equal to the output of the LCLP. In the following, we
adopt the re-scaled definition.

A given code C can have different Tanner graph repre- 
sentations and consequently potentially different fundamental 
polytopes. Hence, we refer to the pseudo-codewords as
corresponding to a particular Tanner graph G of C.

It is also appropriate to mention here that the LCLP can be 
viewed as the zero temperature version of BP-decoder looking 
for the global minimum of the so-called Bethe free energy 
functional [19]. 

III. Cost and Weight of Pseudo-codewords, 
Medians and Instantons 

Since the focus of the paper is on the pseudo-codewords 
for the BSC, in this Section we introduce some terms, e.g. 
instantons and medians, specific to the BSC. We will also 
prove here some preliminary lemmata which will enable 
subsequent discussion of the ISA in the next Section. 

The polytope Q is symmetric and looks exactly the same 
from all codewords (see e.g. [4]). Hence we assume that the 
all-zero-codeword is transmitted. The process of changing a bit 
from 0 to 1 and vice-versa is known as flipping. The BSC flips
every transmitted bit with a certain probability. We therefore 
call a noise vector with support of size k as having k flips. 

In the case of the BSC, the likelihoods are scaled as

$$\gamma_i = \begin{cases} 1, & \text{if } \hat{y}_i = 0; \\ -1, & \text{if } \hat{y}_i = 1. \end{cases}$$



Two important characteristics of a pseudo-codeword are its 
cost and weight. While the cost associated with decoding to 
a pseudo-codeword has already been defined in general, we 
formalize it for the case of the BSC as follows: 



Definition 2: The cost associated with LP decoding of a
binary vector r to a pseudo-codeword p is given by

$$C(r, p) = \sum_{i \notin \mathrm{supp}(r)} p_i - \sum_{i \in \mathrm{supp}(r)} p_i.$$

If r is the input, then the LP decoder converges to the
pseudo-codeword p which has the least value of C(r, p). The cost of
decoding to the all-zero-codeword is zero. Hence, a binary
vector r does not converge to the all-zero-codeword if there
exists a pseudo-codeword p with $C(r, p) \le 0$.
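The cost can be evaluated directly from its definition. The short Python sketch below (function names are illustrative assumptions) also checks that, on the BSC, the cost coincides with the LP objective obtained from the scaled likelihoods, +1 for a received 0 and -1 for a received 1.

```python
from fractions import Fraction


def bsc_llr(r):
    """Scaled BSC likelihoods: +1 where the received bit is 0, -1 where it is 1."""
    return [1 if x == 0 else -1 for x in r]


def cost(r, p):
    """C(r, p): sum of p_i outside supp(r) minus sum of p_i inside supp(r)."""
    outside = sum(pi for ri, pi in zip(r, p) if ri == 0)
    inside = sum(pi for ri, pi in zip(r, p) if ri != 0)
    return outside - inside


def lp_objective(r, p):
    """The LP objective: sum over i of gamma_i * p_i."""
    return sum(g * pi for g, pi in zip(bsc_llr(r), p))


# Hypothetical received vector and pseudo-codeword, for illustration only.
r = [1, 1, 0, 0]
p = [Fraction(1), Fraction(1, 2), Fraction(1, 2), Fraction(0)]
print(cost(r, p), lp_objective(r, p))
```

A non-positive cost means that the all-zero-codeword, whose cost is zero, is not the unique minimizer for the input r.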

Definition 3: [13, Definition 2.10] Let $p = (p_1, \ldots, p_n)$ be
a pseudo-codeword distinct from the all-zero-codeword. Let
e be the smallest number such that the sum of the e largest
$p_i$s, denoted $\sum_e p_i$, is at least $\left(\sum_{i \in V} p_i\right)/2$. Then, the BSC
pseudo-codeword weight of p is

$$w_{BSC}(p) = \begin{cases} 2e, & \text{if } \sum_e p_i = \left(\sum_{i \in V} p_i\right)/2; \\ 2e - 1, & \text{if } \sum_e p_i > \left(\sum_{i \in V} p_i\right)/2. \end{cases}$$

The minimum pseudo-codeword weight of G, denoted by
$w_{\min}^{BSC}$, is the minimum over all the non-zero pseudo-codewords
of G. The parameter $e = \lfloor (w_{BSC}(p) + 1)/2 \rfloor$ can be
interpreted as the least number of bits to be flipped in the
all-zero-codeword such that the resulting vector decodes to the
pseudo-codeword p. (See e.g. [27] for a number of illustrative
examples.)
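The weight definition translates into a few lines of Python. This is a minimal sketch with an assumed function name; it returns both the weight and the parameter e, and uses exact arithmetic so that the equality case is detected reliably.

```python
from fractions import Fraction


def bsc_weight(p):
    """Return (w_BSC(p), e) for a non-zero pseudo-codeword p with p_i >= 0."""
    total = sum(p)
    if total == 0:
        raise ValueError("the weight is defined only for non-zero pseudo-codewords")
    partial = 0
    for e, value in enumerate(sorted(p, reverse=True), start=1):
        partial += value
        if 2 * partial >= total:  # the e largest entries reach half of the total
            weight = 2 * e if 2 * partial == total else 2 * e - 1
            return weight, e


# Hypothetical examples: the first reaches exactly half, the second exceeds it.
print(bsc_weight([1, 1, 0, 0]))
print(bsc_weight([Fraction(1), Fraction(1), Fraction(1), Fraction(0)]))
```

In both branches the returned e equals the integer part of (w + 1)/2, so the weight determines the number of flips in the corresponding noise vector.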

Remark: Feldman et al. in [4] defined the weight of a
pseudo-codeword, the fractional distance and the max-fractional
distance of a code in terms of the projected polytope Q (the
interested reader is referred to [4] for an explicit description of
Q). To differentiate the two definitions, we term the "weight"
defined by Feldman et al. as fractional weight and denote it by
$w_{frac}$. For a point f in Q, the fractional weight of f is defined
as the L1-norm, $w_{frac}(f) = \sum_{i \in V} f_i$, and the max-fractional
weight of f is defined as the fractional weight normalized by
the maximum $f_i$ value, i.e.,

$$w_{max\text{-}frac}(f) = \frac{w_{frac}(f)}{\max_i f_i}.$$

Also, if $V_Q$ denotes the set of non-zero vertices of Q, the
fractional distance $d_{frac}$ of the code is defined as the minimum
weight over all vertices in $V_Q$. The max-fractional distance
$d_{frac}^{\max}$ of the code is given by

$$d_{frac}^{\max} = \min_{(f,w) \in V_Q,\, f \neq 0} \frac{\sum_{i \in V} f_i}{\max_i f_i}.$$

It was shown in [4, Theorem 9] that the LP decoder is
successful if at most $\lceil d_{frac}/2 \rceil - 1$ bits are flipped by the
BSC, thus making $d_{frac}$ a potentially useful characteristic.
Moreover, an efficient LP-based algorithm to calculate $d_{frac}$
was suggested in [4]. However, the error pattern with the
least number of flips which the LP decoder fails to correct
does not necessarily converge to the pseudo-codeword with
fractional weight $d_{frac}$. Hence, we adopt the definition of
the pseudo-codeword weight from [27], [13], noting
that it was discussed there in the different but related context of
computation trees and graph covers. The advantage of our
approach will become evident in the subsequent Sections.
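For concreteness, the fractional quantities of this Remark can be computed as in the following Python sketch. The function names and the numerical values are assumptions for illustration; in particular, the value of the fractional distance used below is hypothetical and is not a property of any code discussed here.

```python
import math
from fractions import Fraction


def frac_weight(f):
    """Fractional weight: the L1-norm of a point f with non-negative entries."""
    return sum(f)


def max_frac_weight(f):
    """Max-fractional weight: fractional weight divided by the largest entry (f non-zero)."""
    return sum(f) / max(f)


def guaranteed_correctable_flips(d_frac):
    """Number of flips the LP decoder is guaranteed to correct: ceil(d_frac / 2) - 1."""
    return math.ceil(d_frac / 2) - 1


# Hypothetical point and hypothetical fractional distance.
f = [Fraction(1, 2)] * 4 + [Fraction(0)]
print(frac_weight(f), max_frac_weight(f))
print(guaranteed_correctable_flips(Fraction(17, 2)))
```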
The following Lemma gives a relation between $w_{\min}^{BSC}$ and
$d_{frac}$.

Lemma 1: $w_{\min}^{BSC} \ge 2\lceil d_{frac}/2 \rceil - 1$.

Proof: The LP decoder is successful if at most
$\lceil d_{frac}/2 \rceil - 1$ bits are flipped by the BSC. So, the minimum
number of flips in the all-zero-codeword which can cause the
LP decoder to fail is $\lceil d_{frac}/2 \rceil$. If e is the minimum number of
flips associated with the minimum weight pseudo-codeword,
then $e \ge \lceil d_{frac}/2 \rceil$.

Since $w_{\min}^{BSC} \ge 2e - 1$, we have $w_{\min}^{BSC} \ge 2\lceil d_{frac}/2 \rceil - 1$. ■
The above lemma can be generalized to any pseudo-codeword
p as $w_{BSC}(p) \ge 2\lceil w_{frac}(p)/2 \rceil - 1$. We would like
to point out that Kelley and Sridhara in [13] have derived a
similar relation between $w_{BSC}(p)$ and $w_{max\text{-}frac}(p)$ and that
Sridhara in [28] observed that $w_{BSC}(p) + 1 \ge w_{max\text{-}frac}(p)$.

The interpretation of BSC pseudo-codeword weight mo- 
tivates the following definition of the median noise vector 
corresponding to a pseudo-codeword: 

Definition 4: The median noise vector (or simply the
median) M(p) of a pseudo-codeword p distinct from the
all-zero-codeword is a binary vector with support $S = \{i_1, i_2, \ldots, i_e\}$,
such that $p_{i_1}, \ldots, p_{i_e}$ are the $e\ (= \lfloor (w_{BSC}(p) + 1)/2 \rfloor)$
largest components of p.

One observes that $C(M(p), p) \le 0$. From the definition of
$w_{BSC}(p)$, it follows that at least one median exists for every
p. Also, all medians of p have $\lfloor (w_{BSC}(p) + 1)/2 \rfloor$ flips. The
proofs of the following two lemmata are now apparent.
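A median can be constructed by selecting the positions of the e largest components. The Python sketch below is a minimal illustration with assumed function names; ties between equal components are broken by position, which yields one of possibly several medians.

```python
from fractions import Fraction


def cost(r, p):
    """C(r, p): sum of p_i outside supp(r) minus sum of p_i inside supp(r)."""
    return sum(-pi if ri else pi for ri, pi in zip(r, p))


def flips_needed(p):
    """Smallest e such that the e largest entries of p reach half of the total."""
    total, partial = sum(p), 0
    for e, value in enumerate(sorted(p, reverse=True), start=1):
        partial += value
        if 2 * partial >= total:
            return e


def median(p):
    """One median M(p): a binary vector supported on the e largest entries of p."""
    e = flips_needed(p)
    # sorted() is stable, so equal entries keep their positional order.
    largest = sorted(range(len(p)), key=lambda i: -p[i])[:e]
    return [1 if i in largest else 0 for i in range(len(p))]


# Hypothetical pseudo-codeword, for illustration only.
p = [Fraction(1), Fraction(1, 2), Fraction(1, 2), Fraction(0)]
m = median(p)
print(m, cost(m, p))
```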

Lemma 2: The LP decoder decodes a binary vector with
k flips into a pseudo-codeword p distinct from the
all-zero-codeword iff $w_{BSC}(p) \le 2k$.

Lemma 3: Let p be a pseudo-codeword with median M(p)
whose support has cardinality k. Then $w_{BSC}(p) \in \{2k - 1, 2k\}$.

Lemma 4: Let M(p) be a median of p with support S.
Then the result of LP decoding of any binary vector with
support $S' \subset S$ and $|S'| < |S|$ is distinct from p.

Proof: Let $|S| = k$. Then by Lemma 3, $w_{BSC}(p) \in
\{2k - 1, 2k\}$. Now, if r is any binary vector with support
$S' \subset S$, then r has at most $k - 1$ flips. If r were decoded
into p, then by Lemma 2, $w_{BSC}(p) \le 2(k - 1)$, which is a contradiction. ■

Lemma 5: If M(p) converges to a pseudo-codeword $p_M \neq
p$, then $w_{BSC}(p_M) \le w_{BSC}(p)$. Also, $C(M(p), p_M) \le
C(M(p), p)$.

Proof: According to the definition of the LP decoder,
$C(M(p), p_M) \le C(M(p), p)$.

If $w_{BSC}(p) = 2k$, then M(p) has k flips and by Lemma
2, $w_{BSC}(p_M) \le 2k = w_{BSC}(p)$.

If $w_{BSC}(p) = 2k - 1$, then M(p) has k flips and
$C(M(p), p) < 0$. Hence, $w_{BSC}(p_M) \le 2k$ by Lemma 2.
However, if $w_{BSC}(p_M) = 2k$, then $C(M(p), p_M) = 0$,
which is a contradiction. Hence, $w_{BSC}(p_M) \le 2k - 1 =
w_{BSC}(p)$. ■



Definition 5: The BSC instanton $\mathbf{i}$ is a binary vector with
the following properties: (1) There exists a pseudo-codeword
p such that $C(\mathbf{i}, p) \le C(\mathbf{i}, 0) = 0$; (2) For any binary vector r
such that $\mathrm{supp}(r) \subset \mathrm{supp}(\mathbf{i})$, there exists no pseudo-codeword
with $C(r, p) \le 0$. The size of an instanton is the cardinality
of its support.

In other words, the LP decoder decodes $\mathbf{i}$ to a
pseudo-codeword other than the all-zero-codeword or one finds a
pseudo-codeword p such that $C(\mathbf{i}, p) = 0$ (interpreted as the
LP decoding failure), whereas any binary vector with flips
from a subset of the flips in $\mathbf{i}$ is decoded to the
all-zero-codeword. It can be easily verified that if c is the transmitted
codeword and r is the received vector such that $\mathrm{supp}(c + r) =
\mathrm{supp}(\mathbf{i})$, where the addition is modulo two, then there exists a
pseudo-codeword p' such that $C(r, p') \le C(r, c)$.

The following lemma follows from the definition of the cost 
of decoding (the pseudo-codeword cost): 

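Lemma 6: Let $\mathbf{i}$ be an instanton. Then for any binary vector
r such that $\mathrm{supp}(\mathbf{i}) \subset \mathrm{supp}(r)$, there exists a pseudo-codeword
p satisfying $C(r, p) \le 0$.

Proof: Since $\mathbf{i}$ is an instanton, there exists a
pseudo-codeword p such that $C(\mathbf{i}, p) \le 0$. From Definition 2 we
have,

$$\sum_{i \notin \mathrm{supp}(\mathbf{i})} p_i - \sum_{i \in \mathrm{supp}(\mathbf{i})} p_i \le 0.$$

Since $\mathrm{supp}(\mathbf{i}) \subset \mathrm{supp}(r)$ and $p_i \ge 0, \forall i$, we have

$$\sum_{i \notin \mathrm{supp}(r)} p_i - \sum_{i \in \mathrm{supp}(r)} p_i \le \sum_{i \notin \mathrm{supp}(\mathbf{i})} p_i - \sum_{i \in \mathrm{supp}(\mathbf{i})} p_i,$$

thus yielding

$$C(r, p) \le 0.$$

■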

The above lemma implies that the LP decoder fails to decode 
every vector r whose support is a superset of an instanton to 
the all-zero-codeword. We now have the following corollary:

Corollary 1: Let r be a binary vector with support S. Let
p be a pseudo-codeword such that $C(r, p) \le 0$. If all binary
vectors with support $S' \subset S$ such that $|S'| = |S| - 1$ converge
to 0, then r is an instanton.
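Corollary 1 suggests a simple instanton test that inspects only the vectors with one flip removed. The Python sketch below assumes a decoder given as a callable; as a stand-in for the LP decoder it minimizes the cost over a short, hand-picked list of candidate pseudo-codewords and treats a non-positive cost as a failure. The candidate list and all names are hypothetical, and a real experiment would call an LP solver instead.

```python
def cost(r, p):
    """C(r, p): sum of p_i outside supp(r) minus sum of p_i inside supp(r)."""
    return sum(-pi if ri else pi for ri, pi in zip(r, p))


def indicator(support, n):
    """Binary vector of length n whose support is the given set of positions."""
    return tuple(1 if i in support else 0 for i in range(n))


def make_toy_decoder(candidates):
    """Stand-in decoder: the best candidate if its cost is <= 0, else all-zero."""
    zero = (0,) * len(candidates[0])

    def decode(r):
        best = min(candidates, key=lambda p: cost(r, p))
        return best if cost(r, best) <= 0 else zero

    return decode


def is_instanton(support, n, decode):
    """Corollary 1: r fails, and every vector with one flip removed decodes to zero."""
    support = set(support)
    if not any(decode(indicator(support, n))):
        return False  # r itself decodes to the all-zero-codeword
    return all(not any(decode(indicator(support - {i}, n))) for i in support)


# Hypothetical single candidate on n = 5 bits, for illustration only.
decode = make_toy_decoder([(1, 1, 1, 0, 0)])
print(is_instanton({0, 1}, 5, decode), is_instanton({0, 1, 2}, 5, decode))
```

Checking only the subsets of size |S| - 1 suffices because, by Lemma 6, a failure on any smaller subset would propagate to every superset of it.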

The above lemmata lead us to the following lemma which 
characterizes all the failures of the LP decoder over the BSC: 

Lemma 7: A binary vector r converges to a pseudo- 
codeword different from the all-zero-codeword iff the support 
of r contains the support of an instanton as a subset. 

The most general form of the above lemma can be stated
as follows: if c is the transmitted codeword and r is
the received vector, then r converges to a pseudo-codeword 
different from c iff the supp(r + c), where the addition is 
modulo two, contains the support of an instanton as a subset. 

From the above discussion, we see that the BSC instantons 
are analogous to the minimal stopping sets for the case 
of iterative/LP decoding over the BEC. In fact, Lemma 7
characterizes all the decoding failures of the LP decoder over
the BSC in terms of the instantons and can be used to derive
analytical estimates of the code performance given the weight
distribution of the instantons. In this sense, the instantons are
more fundamental than the minimal pseudo-codewords [12],
[13] for the BSC (note that this statement does not hold
in the case of the AWGN channel). Two minimal
pseudo-codewords of the same weight can give rise to different numbers
of instantons. This issue was first pointed out by Forney et al.
in [27]. (See Examples 1, 2, 3 for the BSC case in [27].) It 
is also worth noting that an instanton converges to a minimal 
pseudo-codeword. 

It should be noted that finding pseudo-codewords with 
fractional weight $d_{frac}$ is not equivalent to finding minimum
weight pseudo-codewords. The pseudo-codewords with
fractional weight $d_{frac}$ can be used to derive some instantons,
but not necessarily the ones with the least number of flips.
However, as $d_{frac}$ provides a lower bound on the minimum
pseudo-codeword weight, it can be used as a test if the ISA
actually finds an instanton with the least number of flips.
In other words, if the number of flips in the lowest weight
instanton found by the ISA is equal to $\lceil d_{frac}/2 \rceil$, then the
ISA has indeed found the smallest size instanton.

IV. Instanton Search Algorithm and its Analysis 

In this Section, we describe the Instanton Search Algorithm. 
The algorithm starts with a random binary vector with some 
number of flips and outputs an instanton. 

Instanton Search Algorithm 

Initialization (l = 0) step: Initialize to a binary input vector r
containing a sufficient number of flips so that the LP decoder
decodes it into a pseudo-codeword different from the
all-zero-codeword. Apply the LP decoder to r and denote the
pseudo-codeword output of LP by $p^1$.

l ≥ 1 step: Take the pseudo-codeword $p^l$ (output of the (l − 1)
step) and calculate its median $M(p^l)$. Apply the LP decoder
to $M(p^l)$ and denote the output by $p_{M_l}$. By Lemma 5, only
two cases arise:

• $w_{BSC}(p_{M_l}) < w_{BSC}(p^l)$. Then $p^{l+1} = p_{M_l}$ becomes
the l-th step output/(l + 1) step input.

• $w_{BSC}(p_{M_l}) = w_{BSC}(p^l)$. Let the support of $M(p^l)$ be
$S = \{i_1, \ldots, i_{k_l}\}$. Let $S_{i_t} = S \setminus \{i_t\}$ for some $i_t \in S$.
Let $r_{i_t}$ be a binary vector with support $S_{i_t}$. Apply the
LP decoder to all $r_{i_t}$ and denote the $i_t$-output by $p_{i_t}$.
If $p_{i_t} = 0, \forall i_t$, then $M(p^l)$ is the desired instanton
and the algorithm halts. Else, $p_{i_t} \neq 0$ becomes the l-th
step output/(l + 1) step input. (Notice that Lemma 4
guarantees that any $p_{i_t} \neq p^l$, thus preventing the ISA
from entering into an infinite loop.)
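The control flow of the two steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions and not the implementation used for the experiments: the LP decoder is replaced by a stand-in that minimizes the cost over a small hypothetical list of candidate pseudo-codewords, and medians as well as next pseudo-codewords are picked at random when several are available. All names are assumptions.

```python
import random


def cost(r, p):
    """C(r, p): sum of p_i outside supp(r) minus sum of p_i inside supp(r)."""
    return sum(-pi if ri else pi for ri, pi in zip(r, p))


def weight_and_e(p):
    """Return (w_BSC(p), e) for a non-zero pseudo-codeword p."""
    total, partial = sum(p), 0
    for e, value in enumerate(sorted(p, reverse=True), start=1):
        partial += value
        if 2 * partial >= total:
            return (2 * e if 2 * partial == total else 2 * e - 1), e


def median_support(p, rng):
    """Positions of the e largest components of p, ties broken at random."""
    _, e = weight_and_e(p)
    order = sorted(range(len(p)), key=lambda i: (-p[i], rng.random()))
    return sorted(order[:e])


def indicator(support, n):
    return tuple(1 if i in support else 0 for i in range(n))


def make_toy_decoder(candidates):
    """Stand-in for the LP decoder over a finite, hand-picked candidate list."""
    zero = (0,) * len(candidates[0])

    def decode(r):
        best = min(candidates, key=lambda p: cost(r, p))
        return best if cost(r, best) <= 0 else zero

    return decode


def isa(r, decode, rng):
    """Run the search from r; return (instanton support, steps) or None."""
    n = len(r)
    p = decode(r)
    if not any(p):
        return None  # too few flips: r decodes to the all-zero-codeword
    steps = 0
    while True:
        steps += 1
        S = median_support(p, rng)
        p_m = decode(indicator(S, n))
        if weight_and_e(p_m)[0] < weight_and_e(p)[0]:
            p = p_m  # first case: the median decodes to a lighter pseudo-codeword
            continue
        # Second case: test every vector with one flip removed from the median.
        failures = []
        for i in S:
            q = decode(indicator(set(S) - {i}, n))
            if any(q):
                failures.append(q)
        if not failures:
            return S, steps  # all reductions decode to zero: instanton found
        p = rng.choice(failures)


# Hypothetical candidates on n = 5 bits (not vertices of a real polytope).
CANDIDATES = [(1, 1, 1, 0, 0), (1, 1, 1, 1, 1)]
decode = make_toy_decoder(CANDIDATES)
result = isa((0, 0, 1, 1, 1), decode, random.Random(0))
print(result)
```

The random seed only affects which median is chosen; any returned support passes the one-flip-removed test, and the step count stays within twice the number of input flips.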

Fig. 1 illustrates different scenarios arising in the execution
of the ISA. Here, the squares represent pseudo-codewords and
the circles represent binary vectors (noise configurations). Two
squares of the same color have identical pseudo-codeword
weight and two circles of the same color consist of the same
number of flips. Fig. 1(a) shows the case where a median,
$M(p^l)$, of a pseudo-codeword $p^l$ converges to a
pseudo-codeword of a smaller weight. In this case, $p^{l+1} = p_{M_l}$.
Fig. 1(b) illustrates the case where a median, $M(p^l)$, of a
pseudo-codeword $p^l$ converges to a pseudo-codeword $p_{M_l}$
of the same weight. Fig. 1(c) illustrates the case where a
median, $M(p^l)$, of a pseudo-codeword $p^l$ converges to the
pseudo-codeword $p^l$ itself. In the two latter cases, we consider
all the binary vectors whose support sets are subsets of the
support set of $M(p^l)$ and which contain one flip less.
We run the LP decoder with the vectors as inputs and find
their corresponding pseudo-codewords. One of the non-zero
pseudo-codewords found is chosen at random as $p^{l+1}$. This
is illustrated in Fig. 1(d). Fig. 1(e) shows the case when
all the subsets of $M(p^l)$ (reduced by one flip) converge to
the all-zero-codeword. $M(p^l)$ itself could converge to $p^l$ or
some other pseudo-codeword of the same weight. In this case,
$M(p^l)$ is an instanton constituting the output of the algorithm.

We now prove that the ISA terminates (i.e., outputs an
instanton) in a number of steps of the order of the number
of flips in the initial noise configuration.

Theorem 1: $w_{BSC}(p^l)$ and $|\mathrm{supp}(M(p^l))|$ are monotonically
decreasing. Also, the ISA terminates in at most $2k_0$ steps,
where $k_0$ is the number of flips in the input.

Proof: If $p^{l+1} = p_{M_l}$, then $w_{BSC}(p^{l+1}) < w_{BSC}(p^l)$.
Consequently, $|\mathrm{supp}(M(p^{l+1}))| \le |\mathrm{supp}(M(p^l))|$.

If $p^{l+1} = p_{i_t}$, then $w_{BSC}(p_{i_t}) \le 2(|\mathrm{supp}(M(p^l))| -
1) < w_{BSC}(p^l)$. Consequently, $|\mathrm{supp}(M(p^{l+1}))| <
|\mathrm{supp}(M(p^l))|$.

Since $w_{BSC}(p^l)$ is strictly decreasing, the weight of the
pseudo-codeword at step l decreases by at least one compared
to the weight of the pseudo-codeword at step l − 1. Since by
Lemma 2, $w_{BSC}(p^1) \le 2k_0$, the algorithm can run for at
most $2k_0$ steps. ■

Remarks: (1) By "sufficient number of flips", we mean that 
the initial binary vector should be noisy enough to converge 
to a pseudo-codeword other than the all-zero-codeword. While 
any binary vector with a large number of flips is almost 
guaranteed to converge to a pseudo-codeword different from 
the all-zero-codeword, such a choice might also lead to a 
longer running time of the ISA (from Theorem 1). On the
other hand, choosing a binary vector with only a few
flips might lead to convergence to the all-zero-codeword very
often, thereby making it necessary to run the ISA a large
number of times.

(2) Theorem 1 does not claim that the algorithm finds the 
minimum weight pseudo-codeword or the instanton with the 
smallest number of flips. However, it is sometimes possible to 
verify if the algorithm has found the minimum weight
pseudo-codeword. Let $w_{ISA}^{BSC}$ denote the weight of the minimum weight
pseudo-codeword found by the ISA. If $w_{ISA}^{BSC} = 2\lceil d_{frac}/2 \rceil -
1$, then $w_{ISA}^{BSC} = w_{\min}^{BSC}$.

(3) At some step l, it is possible to have $w_{BSC}(p_{M_l}) =
w_{BSC}(p^l)$ and incorporating such pseudo-codewords into the
algorithm could lead to lower weight pseudo-codewords in the 
next few steps. However, this inessential modification was not 
included in the ISA to streamline the analysis of the algorithm. 

(4) While we have shown that $w_{BSC}(p^l)$ decreases by
at least unity at every step, we have observed that in most
cases, it decreases by at least two. This is due to the fact that
the pseudo-codewords with odd weights outnumber
pseudo-codewords with even weights. As a result, in most cases, the
algorithm converges in fewer than $k_0$ steps. (For illustration of
this point see the example discussed in the next Section.)

Fig. 1. Squares represent pseudo-codewords and circles represent medians or related noise configurations. (a) LP decodes the median of a pseudo-codeword
into another pseudo-codeword of smaller weight. (b) LP decodes the median of a pseudo-codeword into another pseudo-codeword of the same weight. (c) LP
decodes the median of a pseudo-codeword into the same pseudo-codeword. (d) Reduced subset (three different green circles) of a noise configuration (e.g. of a
median from the previous step of the ISA) is decoded by the LP decoder into three different pseudo-codewords. (e) LP decodes the median (blue circle) of a
pseudo-codeword (lower red square) into another pseudo-codeword of the same weight (upper red square). The reduced subset of the median (three configurations
depicted as green circles) is decoded by LP into the all-zero-codeword. Thus, the median is an instanton.

(5) At any step, there can be more than one median, and 
the ISA does not specify which one to pick. Our current
implementation picks a median at random. Also, the
algorithm does not specify the choice of the
pseudo-codeword for the case when more than one noise
configuration from the subset $\{r_{i_t}\}$ converges to a pseudo-codeword
distinct from the all-zero-codeword. In this degenerate case,
we again choose a pseudo-codeword for the next iteration
at random. Note that one natural deterministic generalization
of the randomized algorithm consists of exploring all the 
possibilities at once. In such a scenario, a tree of solutions 
can be built, where the root is associated with one set of 
initiation flips, any branch of the tree relates to a given set 
of randomized choices (of medians and pseudo-codewords), 
and any leaf corresponds to an instanton. 

V. Numerical Results 

In this Section, we present results illustrating different 
aspects and features of the ISA. We use the [155, 64, 20] 
Tanner code [26] for illustration purposes. We begin with an 
actual (and rather typical) example. The reader is advised to 
follow this example with an eye on Fig. 2. 

Example 1: The algorithm is initiated with a binary vector r 
whose support set has cardinality 12. In this case, r converges 
to a pseudo-codeword $p^1$ of weight 17 (Lemma 2 guarantees
that $w_{BSC}(p^1) \le 24$). The median $M(p^1)$ of the
pseudo-codeword $p^1$ has 9 flips. $M(p^1)$ converges to a
pseudo-codeword $p_{M_1}$ of weight 11, marked as $p^2$, whose median
$M(p^2)$ contains 6 flips. $M(p^2)$ decodes to a pseudo-codeword
$p_{M_2}$ of weight 11 and hence we consider all vectors whose
support sets consist of one flip less than in the support set of
$M(p^2)$. There are 6 such vectors and 5 of them decode to the
all-zero-codeword (we do not show all six vectors in Fig.
2). The remaining vector decodes to a pseudo-codeword of
weight 9, marked as $p^3$. The pseudo-codeword $p^3$ has only
one median $M(p^3)$, which is decoded to the same
pseudo-codeword $p^3$. Hence, we consider all (five) vectors built
from the median $M(p^3)$ by removing a single flip and observe
that the LP decoder decodes all these vectors into the
all-zero-codeword. We conclude that the median is actually an
instanton of size 5.

We ran 2000 ISA trials using random inputs with a fixed 
number of initiation flips. Fig. 3 shows the frequency of the 
instanton sizes for the number of initiation flips ranging from 
16 to 30. The value at zero should be interpreted as the number 
of patterns that decode to the all-zero-codeword. It can be seen 
that if the initial noise vector consists of 22 or more flips, then 
it converges to a pseudo-codeword different from the all-zero- 
codeword in all 2000 cases. 
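The frequencies plotted for one sub-plot can be computed from the trial outcomes as sketched below. The function name `size_frequencies` and the sample data are hypothetical; size 0 is used, as in the figure, for inputs that decode to the all-zero-codeword.

```python
from collections import Counter


def size_frequencies(sizes):
    """Map each observed instanton size to its relative frequency.

    `sizes` holds one entry per ISA trial; an entry of 0 marks a trial
    whose input decoded to the all-zero-codeword.
    """
    if not sizes:
        raise ValueError("at least one trial is required")
    counts = Counter(sizes)
    total = len(sizes)
    return {size: counts[size] / total for size in sorted(counts)}


# Made-up outcomes of eight trials, for illustration only.
toy_sizes = [0, 5, 5, 6, 5, 7, 0, 6]
freqs = size_frequencies(toy_sizes)
```

By construction the frequencies of each sub-plot sum to one.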

Note that Fig. 3 shows a count of the total number of 
instantons of the same size, so that multiple trials of the ISA 
may correspond to the same instanton. To correct for this 
multiplicity in counting, it is also useful (see the discussion 
below) to study the number of unique instantons 
observed in the ISA trials, which we call the Instanton-Bar-Graph. 
Fig. 4 shows the number of distinct instantons of a given size 
for 2000 and 5000 random initiations with 20 flips. One finds 
that the total number of ISA outputs of size 5 after 2000 trials 
is 720, representing, however, only 155 distinct instantons. In 
this case (of the Tanner code), we can independently verify¹ 
that the total number of instantons of size 5 is indeed 155, thus 
confirming that our algorithm has found all the instantons of 
size 5, detecting each of them roughly 4 times. Obviously, 
the total number of distinct instantons of size 5 does not 
change with a further increase in the number of trials. This 

¹ We have observed that all the instantons of size 5 are in fact the (5, 3) 
trapping sets described in [7]. Further investigation of the topological structure 
of instantons will be addressed in future work. 
Fig. 2. Illustration for the example of ISA execution on the [155, 64, 20] Tanner code discussed in Section V. 



[Figure 3: bar-graph sub-plots. Horizontal axis: size of instanton (0 to 14). Vertical axis: frequency (ticks at 0.2, 0.4, 0.6).] 

Fig. 3. Frequency of instanton (ISA output) sizes for different weights of inputs. (The bar lengths in each sub-plot sum to one.) 
The bar-graphs were obtained by running the ISA for 2000 different random initiations with a fixed number of flips (ranging from 16 to 30). Values at 
zero (if any) show the frequency of patterns decoded into the all-zero-codeword. 



observation emphasizes the utility of the sub-plots with different 
numbers of initiations, as the comparison allows one to judge the 
sufficiency (or insufficiency) of the number of trials for finding 
all the instantons of a given size. Extending the comparison to 
instantons of larger size (> 5), one observes that the numbers 
change in the transition from 2000 to 5000 trials, indicating 
that the statistics are insufficient (at least after 2000 trials), as 
some of the instantons have not yet been found. 
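The Instanton-Bar-Graph and the comparison between two runs can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and the toy supports are hypothetical, and each ISA output is assumed to be available as the set of flipped positions.

```python
from collections import Counter


def instanton_bar_graph(outputs):
    """Count distinct instantons per size, ignoring repeated detections."""
    distinct = {frozenset(support) for support in outputs}
    return Counter(len(instanton) for instanton in distinct)


def unchanged_sizes(graph_small, graph_large):
    """Sizes whose distinct count did not grow between the two runs."""
    return sorted(size for size in graph_large
                  if graph_small.get(size, 0) == graph_large[size])


# Made-up outputs: run_b extends run_a with more trials.
run_a = [{1, 2, 3}, {3, 2, 1}, {4, 5, 6}, {1, 2, 3, 4}]
run_b = run_a + [{1, 2, 3}, {2, 3, 4, 5}, {4, 5, 6}]

graph_a = instanton_bar_graph(run_a)
graph_b = instanton_bar_graph(run_b)
stable = unchanged_sizes(graph_a, graph_b)
```

An unchanged count between runs is only an indication that the instantons of that size may have been exhausted, not a proof; a count that grows shows that the smaller run was insufficient.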

The smallest instanton found by the ISA has size 5. The 
accuracy of this estimate can be verified (indirectly) by finding 
the $d_{frac}$ of the code. Using the method outlined in [4], we 
observed that $d_{frac}$ of the Tanner code is 8.3498. This implies 
that $w_{BSC}^{\min} \ge 9$ (by Lemma 1), which in turn implies that the 
size of any instanton cannot be less than 5. This proves that 
here 5 is, indeed, the smallest instanton size, and that the respective 
minimum pseudo-codeword weight is 9. Note also that the 
fractional weight of all the 155 pseudo-codewords of weight 
5 is 9.95, while the weight of the pseudo-codeword with 
the minimal fractional weight of 8.3498 is 19. This remark 
illustrates that minimality of the fractional weight does not 
imply minimality of the pseudo-codeword weight (and thus 
minimality of the respective instanton size). 
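The chain of implications used above can be written out as a short worked derivation. It rests on assumptions not restated in this section: that Lemma 1 bounds the BSC pseudo-codeword weight from below by $d_{frac}$, that this weight is an integer, and that, as in Example 1, a pseudo-codeword of weight 9 has a median with 5 flips.

```latex
\begin{align*}
d_{frac} = 8.3498
  &\;\Longrightarrow\; w_{BSC}^{\min} \ge \lceil 8.3498 \rceil = 9
  && \text{(Lemma 1, integer weight)} \\
  &\;\Longrightarrow\; \text{instanton size} \ge \frac{9 + 1}{2} = 5
  && \text{(median of a weight-9 pseudo-codeword)}
\end{align*}
```

Since the ISA does find instantons of size 5, the lower bound is attained for this code.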

VI. Summary and Open Problems 

In this paper, we characterized failures of the LP decoder 
over the BSC in terms of the instantons and respective pseudo- 
codewords. We then provided an efficient algorithm for finding 
the instantons. The ISA is guaranteed to terminate in a 
number of steps upper-bounded by twice the number of flips 
in the original input (Theorem 1). Repeated a sufficient number 
of times, the ISA yields the Instanton-Bar-Graph showing 
the number of unique instantons of different sizes. We also 
Fig. 4. Instanton-Bar-Graph showing the number of unique instantons of a given weight found by running the ISA with 20 random flips for 2000 and 5000 
initiations respectively. 



proved that LP decoding of any configuration of the input 
noise which includes an instanton leads to a failure (Lemma 
7). This lemma arguably suggests using the Instanton-Bar- 
Graph obtained with the ISA as a metric for code 
optimization. 

Finally, we conclude with an incomplete list of open prob- 
lems and directions for future research following from this 
study: 

(1) One would like to understand how to choose an initiation 
of the ISA that guarantees convergence to the smallest 
instanton. 

(2) When can one be reasonably certain that all instantons 
of a given weight are found? Or, stating it differently, how 
many trials of the ISA are required to find all the instantons 
of a given size? Does the number of trials scale linearly 
with the size of the code? 

(3) We have noticed that the difficulty of finding an instanton 
grows with its size. Once the ISA finds all the instantons of 
a certain weight, can one optimize the initiation strategy for the 
algorithm to find instantons of larger size more efficiently? 

(4) Can one utilize knowledge of the code structure (e.g. 
for highly structured codes) to streamline discovery of the 
Instanton-Bar-Graph, especially in the part related to the larger 
size instantons? 

(5) Some studies have explored connections between 
pseudo-codewords and stopping sets (see e.g. [13]). Are there 
any (similar?) relationships between trapping sets of the BSC 
(for Gallager-like algorithms) and BSC-LP instantons? 

(6) Are instantons of a code performing over the BSC 
related to instantons of the same code over the AWGN channel 
(or other soft channels)? Can we use one to deduce the other? 